library(dplyr)
library(sf)
library(ggplot2)
library(osmdata)
library(leaflet)
Open Street Map (OSM) is a collaborative project which aims at mapping the world and sharing geospatial data in an open way. Anyone can contribute, by mapping geographical objects their encounter, by adding topical information on existing map objects (their name, function, capacity, etc.), or by mapping buildings and roads from satellite imagery (cf. HOT: Humanitarian OpenStreetMap Team).
This information is then validated by other users and eventually added to the common “map” or information system. This ensures that the information is accessible, open, verified, accurate and up-to-date.
The result looks like this: The geospatial data underlying this
interface is made of geometrical objects (i.e. points, lines, polygons)
and their associated tags (#building #height, #road #secondary #90kph,
etc.).
The first thing to do is to define the area within which you want to
retrieve data, aka the bounding box. This can be defined easily
using a place name and the function getbb() from the
package osmdata.
“This function uses the free Nominatim API provided by OpenStreetMap to find the bounding box (bb) associated with place names.”
We are going to look at Brielle together, but you can also work with the small cities of Naarden, Geertruidenberg, Gorinchem, Enkhuizen or Dokkum.
/! This might not be needed, but ensures that your machine knows it has internet…
assign("has_internet_via_proxy", TRUE, environment(curl::has_internet))
bb <- getbb('Brielle', format_out = 'sf_polygon')
bb
Simple feature collection with 2 features and 0 fields
Geometry type: POLYGON
Dimension: XY
Bounding box: xmin: -74.08588 ymin: 40.08917 xmax: 4.229222 ymax: 51.93141
Geodetic CRS: WGS 84
geometry
1 POLYGON ((4.135294 51.90795...
2 POLYGON ((-74.08588 40.0979...
Because there are different responses from the API query, corresponding to different objects at the same location, or different objects are different locations. Here, we have two options: Brielle (Netherlands) and Brielle (New Jersey)
You can therefore try to be more precise by adding a country code or district name.
bb <- getbb('Brielle, NL', format_out = 'data.frame')
bb
place_id
1 308228079
licence
1 Data © OpenStreetMap contributors, ODbL 1.0. https://osm.org/copyright
osm_type osm_id boundingbox lat
1 relation 2711001 51.8835002, 51.9314095, 4.1352938, 4.2292222 51.901662
lon display_name class
1 4.1626292 Brielle, Voorne aan Zee, Zuid-Holland, Nederland boundary
type importance
1 administrative 0.5957129
icon
1 https://nominatim.openstreetmap.org/ui/mapicons/poi_boundary_administrative.p.20.png
A feature in
the OSM language is a category or tag of a geospatial object. Features
are described by general keys (e.g. “building”, “boundary”, “landuse”,
“highway”), themselves decomposed into sub-categories (values) such as
“farm”, “hotel” or “house” for buildings, “motorway”,
“secondary” and “residential” for highway. This determines
how they are represented on the map.
x <- opq(bbox = bb) %>%
add_osm_feature(key = 'building')%>%
osmdata_sf()
What is this x object made of? It is a table of all the buildings contained in the bounding box, which gives us their OSM id, their geometry and a range of attributes, such as their name, building material, building date, etc. The completion level of this table depends on user contributions and open resources (here for instance: BAG, different in other countries).
head(x$osm_polygons)
Simple feature collection with 6 features and 46 fields
Geometry type: POLYGON
Dimension: XY
Bounding box: xmin: 4.135967 ymin: 51.904 xmax: 4.150993 ymax: 51.91276
Geodetic CRS: WGS 84
osm_id name addr:city addr:housename addr:housenumber addr:postcode
51054695 51054695 <NA> <NA> <NA> <NA> <NA>
51054698 51054698 <NA> <NA> <NA> <NA> <NA>
51054719 51054719 <NA> <NA> <NA> <NA> <NA>
51054751 51054751 <NA> <NA> <NA> <NA> <NA>
51054849 51054849 <NA> <NA> <NA> <NA> <NA>
51054858 51054858 <NA> <NA> <NA> <NA> <NA>
addr:street alt_name amenity architect bridge:support building
51054695 <NA> <NA> <NA> <NA> <NA> yes
51054698 <NA> <NA> <NA> <NA> <NA> yes
51054719 <NA> <NA> <NA> <NA> <NA> yes
51054751 <NA> <NA> <NA> <NA> <NA> yes
51054849 <NA> <NA> <NA> <NA> <NA> yes
51054858 <NA> <NA> <NA> <NA> <NA> yes
building:levels building:material building:name construction
51054695 <NA> <NA> <NA> <NA>
51054698 <NA> <NA> <NA> <NA>
51054719 <NA> <NA> <NA> <NA>
51054751 <NA> <NA> <NA> <NA>
51054849 <NA> <NA> <NA> <NA>
51054858 <NA> <NA> <NA> <NA>
dance:teaching denomination description emergency fixme golf heritage
51054695 <NA> <NA> <NA> <NA> <NA> <NA> <NA>
51054698 <NA> <NA> <NA> <NA> <NA> <NA> <NA>
51054719 <NA> <NA> <NA> <NA> <NA> <NA> <NA>
51054751 <NA> <NA> <NA> <NA> <NA> <NA> <NA>
51054849 <NA> <NA> <NA> <NA> <NA> <NA> <NA>
51054858 <NA> <NA> <NA> <NA> <NA> <NA> <NA>
heritage:operator historic image layer leisure level location
51054695 <NA> <NA> <NA> <NA> <NA> <NA> <NA>
51054698 <NA> <NA> <NA> <NA> <NA> <NA> <NA>
51054719 <NA> <NA> <NA> <NA> <NA> <NA> <NA>
51054751 <NA> <NA> <NA> <NA> <NA> <NA> <NA>
51054849 <NA> <NA> <NA> <NA> <NA> <NA> <NA>
51054858 <NA> <NA> <NA> <NA> <NA> <NA> <NA>
man_made natural old_name operator ref:bag ref:bag:old ref:rce
51054695 storage_tank <NA> <NA> <NA> <NA> <NA> <NA>
51054698 storage_tank <NA> <NA> <NA> <NA> <NA> <NA>
51054719 storage_tank <NA> <NA> <NA> <NA> <NA> <NA>
51054751 storage_tank <NA> <NA> <NA> <NA> <NA> <NA>
51054849 storage_tank <NA> <NA> <NA> <NA> <NA> <NA>
51054858 storage_tank <NA> <NA> <NA> <NA> <NA> <NA>
religion roof:levels roof:shape source source:date start_date
51054695 <NA> <NA> <NA> 3dShapes <NA> <NA>
51054698 <NA> <NA> <NA> 3dShapes <NA> <NA>
51054719 <NA> <NA> <NA> 3dShapes <NA> <NA>
51054751 <NA> <NA> <NA> 3dShapes <NA> <NA>
51054849 <NA> <NA> <NA> 3dShapes <NA> <NA>
51054858 <NA> <NA> <NA> 3dShapes <NA> <NA>
website wikidata wikipedia geometry
51054695 <NA> <NA> <NA> POLYGON ((4.136178 51.90762...
51054698 <NA> <NA> <NA> POLYGON ((4.136317 51.90752...
51054719 <NA> <NA> <NA> POLYGON ((4.14001 51.90721,...
51054751 <NA> <NA> <NA> POLYGON ((4.142694 51.90404...
51054849 <NA> <NA> <NA> POLYGON ((4.147362 51.91272...
51054858 <NA> <NA> <NA> POLYGON ((4.150993 51.91204...
For instance: the building age focusing on post-1900 buildings.
First, we are going to select the polygons and reproject them with the Amersfoort/RD New projection, suited for maps centred on the Netherlands. This code for this projection is: 28992.
buildings <- x$osm_polygons %>%
st_transform(.,crs=28992)
Then we create a variable which a threshold at 1900. Every date prior to 1900 will be recoded 1900, so that buildings older than 1900 will be represented with the same shade.
Then we use the ggplot function to visualise the
buildings by age. The specific function to represent information as a
map is geom_sf(). The rest works like other graphs and
visualisation, with aes() for the aesthetics.
buildings$build_date <- as.numeric(
if_else(
as.numeric(buildings$start_date) < 1900,
1900,
as.numeric(buildings$start_date)
)
)
ggplot(data = buildings) +
geom_sf(aes(fill = build_date, colour=build_date)) +
scale_fill_viridis_c(option = "viridis")+
scale_colour_viridis_c(option = "viridis")
So this reveals the historical centre of [city] and the various extensions. Anything odd? what? around the centre? Why these limits / isolated points?
Leaflet (15 min)sf packagesf?sf is a package which supports simple features (sf), “a standardized
way to encode spatial vector data.”. It contains a large set of
functions to achieve all the operations on vector spatial data for which
you might use traditional GIS software: change the coordinate system,
join layers, intersect or unit polygons, create buffers and centroids,
etc. cf. the sf cheatsheet.
Let’s focus on old buildings and imagine we’re in charge of their conservation. We want to know how much of the city would be affected by a non-construction zone of 100m around pre-1800 buildings.
Let’s select them and see where they are.
old <- 1800 # year prior to which you consider a building old
summary(buildings$start_date)
Length Class Mode
8530 character character
buildings$start_date <- as.numeric(as.character(buildings$start_date))
old_buildings <- buildings %>%
filter(start_date <= old)
ggplot(data = old_buildings) + geom_sf(colour="red")
As conservationists, we want to create a zone around historical
buildings where building regulation will have special restrictions to
preserve historical buildings.
Let’s say this zone should be 100 meters. In GIS terms, we want to
create a buffer around polygons. The corresponding function
sf is st_buffer, with 2 arguments: the
polygons around which to create buffers, and the radius of the
buffer.
distance <- 100 # in meters
buffer_old_buildings <-
st_buffer(x = old_buildings, dist = distance)
ggplot(data = buffer_old_buildings) + geom_sf()
Now, we have a lot of overlapping buffers. We would rather create a
unique conservation zone rather than overlapping ones in that case. So
we have to fuse the overlapping buffers into one polygon. This operation
is called union and the corresponding function is
st_union.
single_old_buffer <- st_union(buffer_old_buildings) %>%
st_cast(to = "POLYGON") %>%
st_as_sf()
single_old_buffer<- single_old_buffer %>%
mutate("ID"=as.factor(1:nrow(single_old_buffer))) %>%
st_transform(.,crs=28992)
We also use st_cast() to explicit the type of the
resulting object (POLYGON instead of the default
MULTIPOLYGON), st_as_sf() to transform the polygon
into an sf object.
For the sake of visualisation speed, we would like to represent
buildings by a single point (for instance: their geometric centre)
rather than their actual footprint. This operation means defining their
centroid and the corresponding function is st_centroid.
sf::sf_use_s2(FALSE)
centroids_old <- st_centroid(old_buildings) %>%
st_transform(.,crs=28992)
ggplot() +
geom_sf(data = single_old_buffer, aes(fill=ID)) +
geom_sf(data = centroids_old)
Now what we would like to distinguish conservation areas based on the
number of historic buildings they contain. In GIS terms, we would like
to know how many centroids each fused buffer polygon contains. This
operation means intersecting the layer of polygons with the layer of
points and the corresponding function is
st_intersection.
centroids_buffers <-
st_intersection(centroids_old,single_old_buffer) %>%
mutate(n=1)
centroid_by_buffer <- centroids_buffers %>%
group_by(ID) %>%
summarise(n = sum(n))
single_buffer <- single_old_buffer %>%
mutate(n_buildings = centroid_by_buffer$n)
ggplot() +
geom_sf(data = single_buffer, aes(fill=n_buildings)) +
scale_fill_viridis_c(alpha = 0.8,
begin = 0.6,
end = 1,
direction = -1,
option = "B")
Let’s map this layer over the initial map of individual buildings.
ggplot() +
geom_sf(data = buildings) +
geom_sf(data = single_buffer, aes(fill=n_buildings), colour = NA) +
scale_fill_viridis_c(alpha = 0.6,
begin = 0.6,
end = 1,
direction = -1,
option = "B")
old <- 1939
distance <- 10
#select
old_buildings <- buildings %>%
filter(start_date <= old)
#buffer
buffer_old_buildings <- st_buffer(old_buildings, dist = distance)
#union
single_old_buffer <- st_union(buffer_old_buildings) %>%
st_cast(to = "POLYGON") %>%
st_as_sf()
single_old_buffer <- single_old_buffer %>%
mutate("ID"=1:nrow(single_old_buffer)) %>%
st_transform(single_old_buffer,crs=4326)
#centroids
centroids_old <- st_centroid(old_buildings) %>%
st_transform(.,crs=4326)
#intersection
centroids_buffers <- st_intersection(centroids_old,single_old_buffer) %>%
mutate(n=1)
centroid_by_buffer <- centroids_buffers %>%
group_by(ID) %>%
summarise(
n = sum(n)
)
single_buffer <- single_old_buffer %>%
mutate(n_buildings = centroid_by_buffer$n)
ggplot() +
geom_sf(data = buildings) +
geom_sf(data = single_buffer, aes(fill=n_buildings), colour = NA) +
scale_fill_viridis_c(alpha = 0.6,
begin = 0.6,
end = 1,
direction = -1,
option = "B")
Problem: there are many pre-war buildings and the buffers are large so the number of old buildings is not very meaningful. Let’s compute the density of old buildings per buffer zone.
single_buffer$area <- st_area(single_buffer, ) %>% units::set_units(., km^2)
single_buffer$old_buildings_per_km2 <- as.numeric(single_buffer$n_buildings / single_buffer$area)
ggplot() +
geom_sf(data = buildings) +
geom_sf(data = single_buffer, aes(fill=old_buildings_per_km2), colour = NA) +
scale_fill_viridis_c(alpha = 0.6,
begin = 0.6,
end = 1,
direction = -1,
option = "B")
We have seen how OpenStreetMap (OSM) geodata works and how to import, select, and visualise OSM vector data. We have seen how to create spatial buffers and centroids, how to intersect vector data and how retrieve the area of polygons.
In short:
osmdata, ggplot and
sf packagesst_* functions for basic GIS operations